Group Members:-
Hisham
Nandana
Devika
The data come from Horticulture Statistics at a Glance 2018, which contains a number of datasets on the horticulture industry. We selected Section 7.1, which gives the all-India time series of area, production and yield of important horticulture crops. Area is measured in 1000 Ha and is also converted to 1000 sq km for easier interpretation; production is measured in 1000 MT and productivity in MT/Ha. The section contains 11 tables, numbered 7.1.1 through 7.1.11. Table 7.1.1 shows the area, production and productivity of the main horticulture crop groups (fruits, vegetables, flowers and aromatic, plantation crops, and spices) for 1992 and then from 2001 to 2018.
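The conversion from 1000 Ha to 1000 sq km follows from 1 sq km = 100 ha, so the numeric value is divided by 100. The sketch below illustrates this in base R; `to_1k_sq_km` is a hypothetical helper name introduced here, not a function from the original analysis, and the input values are only examples.

```r
# Hypothetical helper: convert area from thousand hectares to thousand sq km.
# 1 sq km = 100 ha, so the numeric value is divided by 100.
to_1k_sq_km <- function(area_in_1k_ha) {
  area_in_1k_ha / 100
}

# Example values in 1000 Ha
area_in_1k_ha <- c(164.2, 161.3, 146.2)
area_in_1k_sq_km <- to_1k_sq_km(area_in_1k_ha)
print(area_in_1k_sq_km)
```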
(Table 7.1.1)
The data set is taken from “Horticulture Statistics at a Glance 2018”. Table 7.1.1 shows the area, production and productivity of horticulture crops such as fruits, vegetables, flowers & aromatic, plantation crops and spices in 1992 and then from 2002-2018. Area is given in ’000 Ha, production in ’000 MT and productivity in MT/Ha.
The numerical summary gives the mean, maximum and minimum values of area, production and productivity for each crop group (fruits, vegetables, flowers & aromatic, plantation crops and spices).
library(dplyr)

# Per-crop maximum, minimum and mean of area, production and productivity
horti_sum <- horti_data_clean %>%
  group_by(crop) %>%
  summarize(
    max_area = max(area_in_1k_ha),
    max_production = max(production_in_1k_mt),
    max_productivity = max(productivity_in_mt_per_ha),
    min_area = min(area_in_1k_ha),
    min_production = min(production_in_1k_mt),
    min_productivity = min(productivity_in_mt_per_ha),
    mean_area = mean(area_in_1k_ha),
    mean_production = mean(production_in_1k_mt),
    mean_productivity = mean(productivity_in_mt_per_ha)
  )
The visual summary helps us understand the trend in area, production and productivity for each crop group over the years, as well as the crop-wise total area, production and productivity.
The plots of total area, production and productivity for each crop group show which group has the highest and the lowest totals from 1992 to 2018.
Over the years, both area and production show an increasing trend. Among the crop groups, vegetables have the largest area and production throughout. However, many other factors may have affected the area and production of each crop, either positively or negatively.
(Table 7.1.3 - 7.1.11 merged)
names(horticulture_data)
## [1] "id" "year"
## [3] "crop_name" "crop_type"
## [5] "area_in_1k_ha" "area_in_1k_sq_km"
## [7] "production_in_1k_mt" "productivity_in_mt_per_ha"
unique(horticulture_data$crop_name)
## [1] "Lime/Lemon" "Orange" "Mosambi" "Apple" "Banana"
## [6] "Grapes" "Guava" "Litchi" "Mango" "Papaya"
## [11] "Pineapple" "Sapota" "Brinjal" "Cabbage" "Cauliflower"
## [16] "Okra" "Onion" "Peas" "Tomato" "Potato"
## [21] "Sweet Potato" "Tapioca" "Arecanut" "Cashewnut" "Coconut"
## [26] "SPICES"
unique(horticulture_data$crop_type)
## [1] "FRUIT" "VEGETABLE" "PLANTATION CROPS" "SPICES"
unique(horticulture_data$year)
## [1] 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015
## [16] 2016 2017 2018 1992
str(horticulture_data)
## tibble [471 × 8] (S3: tbl_df/tbl/data.frame)
## $ id : num [1:471] 1 2 3 4 5 6 7 8 9 10 ...
## $ year : num [1:471] 2001 2002 2003 2004 2005 ...
## $ crop_name : chr [1:471] "Lime/Lemon" "Lime/Lemon" "Lime/Lemon" "Lime/Lemon" ...
## $ crop_type : chr [1:471] "FRUIT" "FRUIT" "FRUIT" "FRUIT" ...
## $ area_in_1k_ha : num [1:471] 164.2 161.3 146.2 167.8 78.9 ...
## $ area_in_1k_sq_km : num [1:471] 1.642 1.613 1.462 1.678 0.789 ...
## $ production_in_1k_mt : num [1:471] 1377 1414 1440 1493 1033 ...
## $ productivity_in_mt_per_ha: num [1:471] 8.4 8.8 9.8 8.9 13.1 8 7.8 8.3 8.1 8.9 ...
summary(horticulture_data)
## id year crop_name crop_type
## Min. : 1.0 Min. :1992 Length:471 Length:471
## 1st Qu.:118.5 1st Qu.:2005 Class :character Class :character
## Median :236.0 Median :2009 Mode :character Mode :character
## Mean :236.0 Mean :2009
## 3rd Qu.:353.5 3rd Qu.:2014
## Max. :471.0 Max. :2018
##
## area_in_1k_ha area_in_1k_sq_km production_in_1k_mt
## Min. : 27.2 Min. : 0.272 Min. : 243.8
## 1st Qu.: 161.5 1st Qu.: 1.615 1st Qu.: 1382.0
## Median : 319.2 Median : 3.192 Median : 3667.9
## Mean : 643.0 Mean : 6.430 Mean : 6982.1
## 3rd Qu.: 740.0 3rd Qu.: 7.400 3rd Qu.: 8731.0
## Max. :5909.0 Max. :59.090 Max. :51310.0
## NA's :4 NA's :4 NA's :4
## productivity_in_mt_per_ha
## Min. : 0.60
## 1st Qu.: 7.35
## Median :11.10
## Mean :13.85
## 3rd Qu.:18.85
## Max. :44.20
## NA's :4
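The summary above reports four missing values in each numeric column. A minimal sketch of how such values can be counted and excluded from a statistic in base R is given below; the data frame `toy` and its values are made up for illustration and are not taken from the dataset.

```r
# Toy data standing in for horticulture_data (values are illustrative only).
toy <- data.frame(
  crop_type = c("FRUIT", "FRUIT", "VEGETABLE", "VEGETABLE"),
  area_in_1k_ha = c(100, NA, 300, 500),
  production_in_1k_mt = c(1000, 1200, NA, 9000),
  stringsAsFactors = FALSE
)

# Number of missing values in each column.
na_counts <- colSums(is.na(toy))
print(na_counts)

# Without na.rm = TRUE, the mean of a column containing NA is itself NA.
mean_with_na <- mean(toy$area_in_1k_ha)
# With na.rm = TRUE, the missing value is dropped before averaging.
mean_without_na <- mean(toy$area_in_1k_ha, na.rm = TRUE)
print(mean_without_na)
```

The same `na.rm = TRUE` argument is accepted by `max()`, `min()` and `sum()`.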
The plot shows the change in area of the major crop types over time. Except for spices, the area of every crop type has mostly been increasing, indicating that these crops are being planted more widely. Spices show an irregular rise and fall in area.
## Warning: Removed 2 rows containing missing values (`geom_line()`).
The plot shows the change in production of the major crop types over time. Production of almost all crop types has mostly been increasing, indicating that more of each crop is being produced.
## Warning: Removed 2 rows containing missing values (`geom_line()`).
The plot shows the change in productivity of the different crop types. There is not much change for spices and plantation crops, whereas fruits and vegetables change steadily.
The plot shows total area in different years and there is a clear increasing trend, indicating that there is more cultivation being done.
The plot shows total production in different years and there is a clear increasing trend, indicating that production of crops is almost always increasing.
The plot shows total productivity in different years. There is an increasing trend, indicating that the productivity of crops (yield efficiency) is improving.
The plot shows the total area of the top 8 crops. Spices are cultivated over the largest area.
The plot shows the total production of the top 8 crops. Potato has the highest production, followed by banana.
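Yearly totals like those plotted above can be computed with `aggregate()`. As a side note, a sum of per-crop productivity values mixes ratios with different denominators, so one alternative yearly figure is total production divided by total area. The sketch below shows that calculation under the assumption of made-up values; `toy` and `yearly` are illustrative names, and this is not necessarily how the original plots were produced.

```r
# Toy data (illustrative values only, not from the dataset).
toy <- data.frame(
  year = c(2017, 2017, 2018, 2018),
  crop_name = c("Potato", "Banana", "Potato", "Banana"),
  area_in_1k_ha = c(2000, 500, 2100, 900),
  production_in_1k_mt = c(40000, 10000, 44100, 30900),
  stringsAsFactors = FALSE
)

# Total area and total production per year.
yearly <- aggregate(
  cbind(area_in_1k_ha, production_in_1k_mt) ~ year,
  data = toy,
  FUN = sum
)

# Overall productivity in MT/Ha: (1000 MT) / (1000 Ha) = MT/Ha.
yearly$productivity_in_mt_per_ha <-
  yearly$production_in_1k_mt / yearly$area_in_1k_ha
print(yearly)
```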
The plot shows total productivity of top 8 crops. Banana is the crop with top productivity.
The plot shows area of top 5 crops in 2018. Spices is the top crop here.
The plot shows production of top 5 crops in 2018. Potato is the top crop here.
The plot shows productivity of top 5 crops in 2018. Papaya is the top crop here.
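The "top N crops in a year" selections described above can be sketched in base R as follows. `top_n_crops` is a hypothetical helper written for this illustration, and the data values are made up.

```r
# Toy data (illustrative values only, not from the dataset).
toy <- data.frame(
  year = c(2018, 2018, 2018, 2018, 2017),
  crop_name = c("Potato", "Banana", "Mango", "Papaya", "Potato"),
  production_in_1k_mt = c(500, 300, 200, 60, 480),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the n rows with the largest value_col in target_year.
top_n_crops <- function(data, value_col, target_year, n = 5) {
  # Keep the requested year and drop rows where the value is missing.
  in_year <- data[data$year == target_year & !is.na(data[[value_col]]), ]
  # Sort from largest to smallest and keep the first n rows.
  ranked <- in_year[order(in_year[[value_col]], decreasing = TRUE), ]
  head(ranked, n)
}

top2 <- top_n_crops(toy, "production_in_1k_mt", 2018, n = 2)
print(top2$crop_name)
```

Passing a different column name, such as an area or productivity column, gives the corresponding ranking.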
The plot shows the change in production, measured in thousands of metric tons (MT), for the crops that had the smallest and largest increase in production between 2002 and 2017. The x-axis shows the crop names, and the y-axis shows the change in production.
Year 2002 was the first year with sufficient data in the dataset and year 2017 was the last year with sufficient data in the dataset.
The bars are colored differently to indicate whether the change in production was positive or negative. The bars in green represent crops that had an increase in production, while the bars in red represent crops that had a decrease in production.
The plot clearly shows that the crop with the smallest change in production was ‘Tapioca’, with a change of -2345.1 thousand MT, while the crop with the largest increase in production was ‘Potato’, with a change of 24148.5 thousand MT. This indicates that ‘Potato’ was the most successful crop in terms of increasing production between 2002 and 2017, while ‘Tapioca’ was the least successful. The change for Tapioca is negative, meaning that its production decreased rather than increased.
Overall, the Production Change (in 1k MT) for Crops with Smallest and Largest Increase in Production (2002-2017) plot provides a clear and concise summary of the changes in production for the different crops, and allows for easy comparison between the crops that had the smallest and largest increase in production.
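The change in production between two years can be computed by joining the start-year and end-year rows on crop name and subtracting. The sketch below is one way to do this in base R with made-up values; the variable names are illustrative and the original plotting code may have used a different approach.

```r
# Toy data (illustrative values only, not from the dataset).
toy <- data.frame(
  year = c(2002, 2017, 2002, 2017, 2002, 2017),
  crop_name = c("Potato", "Potato", "Tapioca", "Tapioca", "Onion", "Onion"),
  production_in_1k_mt = c(100, 250, 80, 60, 50, 90),
  stringsAsFactors = FALSE
)

cols <- c("crop_name", "production_in_1k_mt")
first_year <- toy[toy$year == 2002, cols]
last_year <- toy[toy$year == 2017, cols]

# One row per crop, with the 2002 and 2017 production side by side.
change <- merge(first_year, last_year, by = "crop_name",
                suffixes = c("_2002", "_2017"))
change$production_change <-
  change$production_in_1k_mt_2017 - change$production_in_1k_mt_2002

# Label the sign of the change (used for the green/red bar colours).
change$direction <- ifelse(change$production_change >= 0,
                           "increase", "decrease")

largest <- change$crop_name[which.max(change$production_change)]
smallest <- change$crop_name[which.min(change$production_change)]
print(change[, c("crop_name", "production_change", "direction")])
```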
(Table 7.1.2)
The Horticulture Statistics at a Glance 2018 dataset provides information on horticulture crops in India, including the type of crop, its name, the area under cultivation, the production in metric tons, and the year in which the data was recorded. Because the data is available for multiple years, it allows trend analysis and comparison of crop performance over time. The dataset gives insight into the state of horticulture crops in India, such as which crops are most widely grown and their production levels, and its analysis can inform an assessment of the current state and future prospects of the horticulture sector in the country.
summary(ht_data)
## id table_num table_name year
## Min. : 1.00 Length:312 Length:312 Min. :2015
## 1st Qu.: 78.75 Class :character Class :character 1st Qu.:2016
## Median :156.50 Mode :character Mode :character Median :2016
## Mean :156.50 Mean :2016
## 3rd Qu.:234.25 3rd Qu.:2017
## Max. :312.00 Max. :2018
##
## crop_type crop_name area_in_1k_ha area_in_1k_sq_km
## Length:312 Length:312 Min. : 1.00 Min. : 0.0000
## Class :character Class :character 1st Qu.: 44.75 1st Qu.: 0.3975
## Mode :character Mode :character Median : 131.50 Median : 1.2300
## Mean : 323.73 Mean : 3.1128
## 3rd Qu.: 358.25 3rd Qu.: 3.2175
## Max. :2258.00 Max. :22.5800
## NA's :12
## production_in_1k_mt production_in_10lakh_mt student_name
## Min. : 0 Mode:logical Length:312
## 1st Qu.: 197 NA's:312 Class :character
## Median : 1056 Mode :character
## Mean : 3782
## 3rd Qu.: 2844
## Max. :51310
## NA's :3
This graph provides an overview of the distribution of horticulture crop production levels in the dataset, allowing the viewer to see which production levels are more or less common.
The graph is a histogram showing the distribution of horticulture crop production in units of 1000 metric tons. The x-axis represents the range of production levels, while the y-axis shows the count or frequency of each level. The highest count is in the 0-10000 range.
The histogram shows the distribution of the variable area_in_1k_ha in the dataset ht_data_filtered. The x-axis is labeled “Area in 1000 Ha”, the y-axis “Count”, and the plot is titled “Distribution of Area”. The x-axis represents the area in thousands of hectares, and the height of each bar is the number of observations falling into that bin. The highest count is in the 0-500 range.
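The bin counts behind a histogram can also be tabulated directly, which makes it easy to name the most common range. The sketch below uses base R `cut()` and `table()` on made-up production values; the bin width of 10,000 is an assumption chosen for illustration, since the bin width of the original plots is not stated.

```r
# Made-up production values in 1000 MT (not from the dataset).
production <- c(150, 900, 2500, 4800, 9000, 12000, 27000, 48000)

# Assumed bin edges: 0, 10000, ..., 50000.
breaks <- seq(0, 50000, by = 10000)

# Assign each value to a bin; include.lowest keeps a value of exactly 0.
bins <- cut(production, breaks = breaks, include.lowest = TRUE, dig.lab = 6)

# Count of observations per bin, as a histogram would display.
bin_counts <- table(bins)
print(bin_counts)

# Label of the bin with the highest count.
most_common_bin <- names(bin_counts)[which.max(bin_counts)]
print(most_common_bin)
```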
The data used for the bar chart is a subset of the “ht_data” dataset, filtered for the year 2018. The x-axis displays the crop names, and the y-axis displays the production of each crop in thousands of metric tons. The bars are filled in blue and their heights correspond to the production values. From the data we can conclude that the highest producing crop in 2018 is potato, with a production of 48009 thousand MT. The highest producing fruit was banana and the highest producing plantation crop was coconut.
The data used for the plot is ht_data, filtered to show only the year 2018. The x-axis shows the crop names, and the y-axis shows the area of each crop in thousands of hectares. The plot gives a visual representation of the area of each crop grown in 2018, allowing easy comparison between crops. The crop with the largest area was mango (2258 thousand ha). Among plantation crops, coconut has the largest area, and among vegetables, onion has a comparatively large area.
This graph shows the top 10 crops by total area (in 1,000 hectares) grown in 2018. The x-axis displays the crop types, while the y-axis shows the total area (in 1,000 hectares) for each crop type. The height of each bar represents the total area for that crop type, and the colors within each bar indicate the contribution of the individual crops to that total. The same plot can be repeated for other years to compare the area under cultivation over time. The crop with the largest area is mango (2258 thousand ha).
The graph shows the top 10 crops produced in 2018, based on production volume in thousands of metric tons (1k MT). The crops are represented by their names and are color-coded for easy identification. The x-axis shows the crop names, while the y-axis displays the production. The graph highlights the large differences in production between the top 10 crops, with some crops producing more than twice the volume of others. Among these ten, potato has the highest production and cumin the lowest.
The x-axis represents the different crop types, while the y-axis represents the total area and production in thousands of hectares and metric tonnes, respectively.
Each bar is split into two sections, one representing the total area and the other the total production. The area is shown in the darker shade and the production in the lighter shade. The graph shows that the area and production of vegetables are higher than those of the other crop types.
The data plotted is ht_data_filtered. The x-axis represents the crop types and the y-axis the total production in 1,000 metric tons. The chart displays one bar for each crop type in the data set, with the height of the bar giving the total production of that crop type, and each bar is filled with a different color. The x-axis is labeled “Crop Type”, the y-axis “Total Production (in 1k mt)”, and the chart is titled “Total Production by Crop Type”. The crop type with the highest production is vegetables, and the crop type with the lowest production is flowers.
The graph above plots crop type on the x-axis and area in 1,000 hectares on the y-axis. The crop type with the largest area is vegetables, and the crop type with the smallest area is flowers. Across the crop-type plots, vegetables have both the highest production and the largest area.
The x-axis represents the total production of crops in thousands of metric tons (1k MT), and the y-axis represents the crop names. The bars are colored by crop type and show the relative contribution of each crop type to the total production. The graph is faceted by crop type, which means that each crop type has its own panel. The scales on the y-axis are free, which allows for better comparison between the different crops within each crop type. The title of the graph is “Most Produced Crops by Crop Type,” and the x-axis label is “Total Production (in 1k MT),” while the y-axis label is “Crop Name.” The highest producing vegetable is potato and the highest producing plantation crop is Coconut.
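The "most produced crop within each crop type" reading of the faceted plot can be reproduced numerically by summing production per crop and then taking the maximum within each crop type. The sketch below does this in base R on made-up values; `toy`, `totals` and `top_per_type` are illustrative names, not objects from the original analysis.

```r
# Toy data (illustrative values only, not from the dataset).
toy <- data.frame(
  year = c(2016, 2017, 2016, 2017, 2016, 2017, 2016, 2017),
  crop_type = c("VEGETABLE", "VEGETABLE", "VEGETABLE", "VEGETABLE",
                "PLANTATION CROPS", "PLANTATION CROPS",
                "PLANTATION CROPS", "PLANTATION CROPS"),
  crop_name = c("Potato", "Potato", "Onion", "Onion",
                "Coconut", "Coconut", "Arecanut", "Arecanut"),
  production_in_1k_mt = c(400, 450, 200, 210, 150, 160, 10, 12),
  stringsAsFactors = FALSE
)

# Total production per crop across all years.
totals <- aggregate(production_in_1k_mt ~ crop_type + crop_name,
                    data = toy, FUN = sum)

# Within each crop type, keep the row with the largest total.
top_per_type <- do.call(
  rbind,
  lapply(split(totals, totals$crop_type), function(g) {
    g[which.max(g$production_in_1k_mt), ]
  })
)
print(top_per_type)
```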
The x-axis represents the crop names, and the y-axis represents the top production of each crop in lakh metric tons (LMT). The bars are colored light green and represent the top 8 crops based on their total production. The title of the graph is “Top Production Crops,” and the x-axis label is “Crop,” while the y-axis label is “Top Production in Lakh MT.” The highest producing crop-type is vegetable.
In conclusion, the Horticulture Statistics at a Glance 2018 dataset gives a clear picture of the area and production of the various crops and crop types. The table I chose was 7.1.2. In this table, the highest production count is in the 0-10000 range and the highest area count is in the 0-500 range. The highest producing crop was potato, with a production of 48009 thousand MT. The highest producing fruit is banana (29221 thousand MT) and the highest producing plantation crop is coconut. Comparing the area of each crop in 2018 by crop name, the crop with the largest area is mango (2258 thousand ha) and the plantation crop with the largest area is coconut. Coconut is therefore the plantation crop with both the largest area and the highest production. By crop type, vegetables (potato especially) have higher production than the other crop types. The visualizations make the dataset much easier to understand.